A SIMD Vectorizing Compiler for Digital Signal Processing Algorithms
Abstract
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler that automatically generates C code enhanced with short vector instructions for digital signal processing (DSP) transforms, such as the fast Fourier transform (FFT). The input to our compiler is a concise mathematical description of a DSP algorithm in the language SPL. SPL is used in the SPIRAL system (http://www.ece.cmu.edu/~spiral) to generate highly optimized, architecture-adapted implementations of DSP transforms. Interfacing our compiler with SPIRAL yields speed-ups of more than a factor of 2 in several important cases, including the FFT and the discrete cosine transform (DCT) used in the JPEG compression standard. For the FFT, our automatically generated code is competitive with the hand-coded Intel Math Kernel Library.
Related resources
Efficient Vectorization of the FIR Filter
The Finite Impulse Response (FIR) filter is one of the most important digital signal processing (DSP) kernels. It performs filtering of speech signals in modern voice coders such as the ETSI GSM EFR/AMR or ITU G.729, as well as in many other signal processing areas. Many contemporary digital signal processors as well as general-purpose microprocessors employ SIMD instructions to exploit the dat...
Short Vector SIMD Code Generation for DSP Algorithms
Short vector SIMD instructions on recent general purpose microprocessors, such as SSE on Pentium III and 4, offer a high potential speed-up but require a very high level of programming expertise. We present a compiler that generates vectorized code for digital signal processing algorithms such as the fast Fourier transform (FFT). The input to our compiler is a mathematical description of the al...
A Low-Power Multithreaded Processor for Baseband Communication Systems
Embedded digital signal processors for baseband communication systems have stringent design constraints including high computational bandwidth, low power consumption, and low interrupt latency. Furthermore, these processors should be compiler-friendly, so that code for them can quickly be developed in a high-level language. This paper presents the design of a high-performance, low-power digital ...
Efficient Exploitation of Hyper Loop Parallelism in Vectorization
Modern processors can provide large amounts of processing power with vector SIMD units if the compiler or programmer can vectorize their code. With the advance of SIMD support in commodity processors, more and more advanced features are introduced, such as flexible SIMD lane-wise operations (e.g. blend instructions). However, existing vectorizing techniques fail to apply global SIMD lane-wise o...
The Sandblaster Automatic Multithreaded Vectorizing Compiler
Compilers for Digital Signal Processors (DSP) have been inefficient. The constraints have been two-fold. First, signal processing algorithms that use non-associative arithmetic are not easily described in high-level languages such as C, C++, and Java. Second, historical DSP architectures have been difficult compiler targets due to their non-orthogonal instruction sets. With modern DSP architect...
Publication date: 2002